{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "856cb409",
   "metadata": {},
   "source": [
    "# Worksheet 1 - Basic Read Queries"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "355b2b35",
   "metadata": {},
   "source": [
    "### Exercise 1: Find the first name of Dave's friends\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Find the friends of Dave (i.e. traverse the `friends` edge)\n",
    "* Return the friends `first_name`\n",
    "\n",
    "The correct answer is four results: \"Jim\", \"Josh\", \"Hank\", \"Kelly\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c6c741d",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().hasLabel('person')\n",
    ".has('first_name','Dave')\n",
    ".out('friends')\n",
    ".values('first_name')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "76bbc174",
   "metadata": {},
   "source": [
    "### Exercise 2: Find the first name of the friends of Dave's friends\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Find the friends of Dave (i.e. traverse the `friends` edge)\n",
    "* Find the friends of that person (i.e. traverse the `friends` edge)\n",
    "* Return the friends `first_name`\n",
    "\n",
    "The correct answer contains three results: \"Hank\", \"Denise\", \"Paras\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6680e1d4",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().hasLabel('person')\n",
    ".has('first_name','Dave')\n",
    ".out('friends')\n",
    ".out('friends')\n",
    ".dedup()\n",
    ".values('first_name')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "886f9699",
   "metadata": {},
   "source": [
    "### Exercise 3: Find out how the friends of Dave's friends are connected\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Find the friends of Dave (i.e. traverse the `friends` edge)\n",
    "* Find the friends of that person (i.e. traverse the `friends` edge)\n",
    "* Return the path\n",
    "\n",
    "The correct answer contains three results:\n",
    "\n",
    "- `Dave` -> `Josh` -> `Hank`\n",
    "- `Dave` -> `Kelly` -> `Denise`\n",
    "- `Dave` -> `Jim` -> `Paras`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d4191ccb",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().hasLabel('person')\n",
    ".has('first_name','Dave')\n",
    ".out('friends')\n",
    ".out('friends')\n",
    ".dedup()\n",
    ".path()\n",
    ".by(elementMap())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2042c3f5",
   "metadata": {},
   "source": [
    "### Exercise 4: Which friends should we recommend for Dave?\n",
    "\n",
    "A common use case for graphs in social networks is to recommend new connections. There is a significant amount of research in this area (example [here](https://www.science.org/doi/10.1126/sciadv.aax7310#:~:text=The%20triadic%20closure%20mechanism%20uses,features%20of%20empirical%20social%20networks)) but mainly there are two prevailing mechanisms at work in social networks that we can leverage to help provide efficient recommendations to a user.  The first of these mechanisms is called homophily, which is the tendency of similar people to be connected.  Homophily is a driving factor in many social networks, with an important outcome being that people connected to you, or connected to people that are connected to you, tend to be similar to you.  This leads to the second mechanism in a graph, the concept of a triadic closure.  Triadic closure is a way to create or recommend new connections based on common friends or acquaintances.  \n",
    "\n",
    "\n",
    "In this exercise, we are going to leverage triadic closure to recommend friends for Dave.  To accomplish this, we will need to leverage the previously written queries but extend them to:\n",
    "\n",
    "* Find all the friends of friends that do not have a connection to Dave\n",
    "\n",
    "The correct answer contains two results: \"Denise\", \"Paras\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a068ca6",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().hasLabel('person')\n",
    ".has('first_name','Dave').as('dave')\n",
    ".out('friends')\n",
    ".out('friends')\n",
    ".where(neq('dave'))\n",
    ".dedup()\n",
    ".values('first_name')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e8077f9",
   "metadata": {},
   "source": [
    "# Worksheet 2 - Loops and Repeat Queries"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d0551517",
   "metadata": {},
   "source": [
    "### Exercise 1: Find the friends of Dave's Friends using a loop.\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Find the friends of Dave (i.e. traverse the `friends` edge)\n",
    "* Find the friends of that person (i.e. traverse the `friends` edge)\n",
    "* Return the friends `first_name`\n",
    "\n",
    "The correct answer is three results: \"Hank\", \"Denise\", \"Paras\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2ddde36",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().hasLabel('person')\n",
    ".has('first_name','Dave')\n",
    ".repeat(\n",
    "    out('friends')\n",
    "    .simplePath()\n",
    ")\n",
    ".times(2)\n",
    ".dedup()\n",
    ".values('first_name')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84d8182d",
   "metadata": {},
   "source": [
    "### Exercise 2: Find all `person` nodes connected to Dave.\n",
    "\n",
    "Starting at a single node and trying to find all connected children (a.k.a. root to leaf) or trying to find the parent of any child node (a.k.a leaf to root) are two very common hierarchical graph query patterns.  Commonly, these queries supported bill of materials, information organization, or compliance use cases.\n",
    "\n",
    "In this exercise, we will be applying that same query pattern to find the hierarchy of people within our social network.  We'll accomplish this by writing a \"root to leaf\" type query where the root node is our `Dave` node in the social network.\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Keep traversing the outgoing `friends` edge until there are no more outgoing `friends` edges\n",
    "* Return all the paths\n",
    "\n",
    "The correct answer has 5 results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6367292c",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().hasLabel('person')\n",
    ".has('first_name','Dave')\n",
    ".repeat(\n",
    "    out('friends')\n",
    ")\n",
    ".until(not(out('friends')))\n",
    ".path()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a6fba6ab",
   "metadata": {},
   "source": [
    "### Exercise 3: Find all the ways Dave and Denise are connected.\n",
    "\n",
    "A common extension to the path traversal query we wrote in Loop-3 is to return not just \"if\" someone is connected but \"how\" they are connected.\n",
    "\n",
    "In this exercise, we will be making a slight modification to the previous query to return \"how\" Dave and Denise are connected, not just that they are.\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Find the friends of Dave (i.e. traverse the `friends` edge)\n",
    "* Keep traversing the `friends` edge until you find `Denise`\n",
    "* Return the path\n",
    "\n",
    "The correct answer has 3 results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fa0b467b",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().hasLabel('person')\n",
    ".has('first_name','Dave')\n",
    ".repeat(\n",
    "    out('friends')\n",
    "    .simplePath()\n",
    ")\n",
    ".until(\n",
    "    has('first_name','Denise')\n",
    ")\n",
    ".path()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c3e24581",
   "metadata": {},
   "source": [
    "# Worksheet 3 - Ordering, Functions, and Grouping"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b033e716",
   "metadata": {},
   "source": [
    "### Exercise 1: What are the 3 highest rated restaurants?\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find the 3 highest average restaurant rating\n",
    "* Find the associated `cuisine`\n",
    "* Return the restaurant name, the cuisine name, and the average rating\n",
    "* Order the results by average rating descending\n",
    "\n",
    "The results for this query are:\n",
    "\n",
    "|Restaurant name|Cuisine|Avg Rating|\n",
    "|---|---|---|\n",
    "|Lonely Grape|bar|5.0|\n",
    "|Perryman's|bar|4.5|\n",
    "|Rare Bull|steakhouse|4.333333|"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3004eb3",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V()\n",
    ".hasLabel('cuisine')\n",
    ".in('serves')\n",
    ".group()\n",
    ".by(identity())\n",
    ".by(in('about').values('rating').mean())\n",
    ".unfold()\n",
    ".order()\n",
    ".by(values, desc)\n",
    ".limit(3)\n",
    ".unfold()\n",
    ".project('restaurant name','cuisine','avg rating')\n",
    ".by(select(keys).values('name'))\n",
    ".by(select(keys).out('serves').values('name'))\n",
    ".by(select(values))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "854c9109",
   "metadata": {},
   "source": [
    "### Exercise 2: Find the top 3 highest rated restaurants in the city where Dave lives.\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Find the `city` that Dave lives in\n",
    "* Find the average rating of restaurants in that city\n",
    "* Find the top 3 average ratings\n",
    "* Return the restaurant name, address, and average rating\n",
    "* Order by the average rating descending\n",
    "\n",
    "The results for this query are:\n",
    "\n",
    "|Restaurant name|Address|Avg Rating|\n",
    "|---|---|---|\n",
    "|Dave's Big Deluxe|\t490 Ivan Cape|4.0|\n",
    "|Pick & Go|4881 Upton Falls|3.75|\n",
    "|Without Chaser|\t01511 Casper Fall|3.5|"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ea31dd30",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().has('person','first_name','Dave')\n",
    ".out('lives')\n",
    ".in('within')\n",
    ".where(inE('about'))\n",
    ".group()\n",
    ".by(identity())\n",
    ".by(in('about').values('rating').mean())\n",
    ".unfold()\n",
    ".order()\n",
    ".by(values,desc)\n",
    ".limit(3)\n",
    ".unfold()\n",
    ".project('restaurant name','address','avg rating')\n",
    ".by(select(keys).values('name'))\n",
    ".by(select(keys).values('address'))\n",
    ".by(select(values))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "825631b7",
   "metadata": {},
   "source": [
    "### Exercise 3: Which Mexican or Chinese restaurant near Dave is the highest rated?\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Find the `city` that Dave lives in\n",
    "* Find the restaurants in that city that serve 'Mexican' or 'Chinese' food\n",
    "* Find the average rating of those restaurants\n",
    "* Return the restaurant name, address, and average rating\n",
    "* Order by the average rating descending\n",
    "* Return the top 1 result\n",
    "\n",
    "The results for this query are:\n",
    "\n",
    "|Restaurant name|Address|Avg Rating|\n",
    "|---|---|---|\n",
    "|With Salsa|24320 Williamson Causeway|3.5|"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3cd2eae2",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().has('person','first_name','Dave')\n",
    ".out('lives')\n",
    ".in('within')\n",
    ".where(out('serves').has('name',within('Mexican','Chinese')))\n",
    ".where(inE('about'))\n",
    ".group()\n",
    ".by(identity())\n",
    ".by(in('about').values('rating').mean())\n",
    ".unfold()\n",
    ".order()\n",
    ".by(values,desc)\n",
    ".limit(1)\n",
    ".unfold()\n",
    ".project('restaurant name','address','avg rating')\n",
    ".by(select(keys).values('name'))\n",
    ".by(select(keys).values('address'))\n",
    ".by(select(values))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a195b2d2",
   "metadata": {},
   "source": [
    "### Exercise 4: What are the top 3 restaurants, recommended by his friends, where Dave lives? (Personalized Recommendation)\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find a `person` node(s) with a `first_name` of \"Dave\"\n",
    "* Find the `city` that Dave lives in\n",
    "* Find Dave's friends\n",
    "* Find reviews written by Dave's friends in the city \"Dave\" lives in\n",
    "* Find the average rating of those restaurants\n",
    "* Return the restaurant name, address, and average rating\n",
    "* Order by the average rating descending\n",
    "* Return the top 3\n",
    "\n",
    "The results for this query are:\n",
    "\n",
    "|Restaurant name|Address|Avg Rating|\n",
    "|---|---|---|\n",
    "|Dave's Big Deluxe|490 Ivan Cape|4.0|\n",
    "|With Salsa|24320 Williamson Causeway|4.0|\n",
    "|Satiated|370 Hills Estates|3.666667|"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b8b31f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().has('person','first_name','Dave').as('dave')\n",
    ".out('lives')\n",
    ".in('within')\n",
    ".where(in('about').in('wrote').both('friends').where(eq('dave')))\n",
    ".group()\n",
    ".by(identity())\n",
    ".by(in('about').values('rating').mean())\n",
    ".unfold()\n",
    ".order()\n",
    ".by(values,desc)\n",
    ".limit(3)\n",
    ".unfold()\n",
    ".project('restaurant name','address','avg rating')\n",
    ".by(select(keys).values('name'))\n",
    ".by(select(keys).values('address'))\n",
    ".by(select(values))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "897b88c4",
   "metadata": {},
   "source": [
    "# Worksheet 4 - Create, Update and Delete Queries"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d71b8ed",
   "metadata": {},
   "source": [
    "### Exercise 1: Create a new person `Leonhard Euler`  and connect them to `Dave`.\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Create a new `person` node with a name of `Leonhard Euler` \n",
    "* Connect the new node to \"Dave\" via a `friends` edge\n",
    "* Return the new connection\n",
    "\n",
    "The results for this query is ID of the new edge"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49e157a8",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.addV('person').property('name','Leonhard Euler')\n",
    " .addE('friends').to(__.V().has('person','first_name','Dave'))\n",
    " \n",
    "//OR\n",
    "\n",
    "//g\n",
    "// .mergeV([(T.id):'leo', (T.label):'person', name: 'Leonhard Euler')\n",
    "// .mergeE([(T.label):'friends',(from):Merge.outV,(to):Merge.inV])\n",
    "//    .option(Merge.outV, [(T.label): 'person', name: 'Leonhard Euler'])\n",
    "//    .option(Merge.inV, [(T.label): 'person', first_name: 'Dave', last_name: 'Bech'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8abdb992",
   "metadata": {},
   "source": [
    "### Exercise 2: Upsert a list of followers and add an edge to `Dave`.\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Given the following list:\n",
    "    ```\n",
    "    [{first_name: 'Taylor', last_name: 'Hall'},\n",
    "    {first_name: 'Kelvin', last_name: 'Fernsby'},\n",
    "    {first_name: 'Ian', last_name: 'Rochester'}]\n",
    "    ```\n",
    "* Add or update `person` nodes for each item in the list\n",
    "* Add or update a `follows` relationship between each new node and \"Dave\"\n",
    "* If the edge is created write a property `creation` with a value `Created`\n",
    "* If the edge already exists write a property `creation` with a value `Updated`\n",
    "* Return the new edge elements\n",
    "* This query should be re-runnable without creating new nodes or edges\n",
    "\n",
    "The results for this query are the three edge elements"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d82cd3e",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.V().hasLabel(\"person\")\n",
    ".has(\"first_name\",\"Taylor\").has(\"last_name\",\"Hall\")\n",
    ".fold().coalesce(unfold(),addV('person').property('first_name','Taylor').property('last_name','Hall'))\n",
    ".V().hasLabel(\"person\")\n",
    " .has(\"first_name\",\"Kelvin\").has(\"last_name\",\"Fernsby\")\n",
    " .fold().coalesce(unfold(),addV('person').property('first_name','Kelvin').property('last_name','Fernsby'))\n",
    ".V().hasLabel(\"person\")\n",
    " .has(\"first_name\",\"Ian\").has(\"last_name\",\"Rochester\")\n",
    " .fold().coalesce(unfold(),addV('person').property('first_name','Ian').property('last_name','Rochester'))\n",
    "\n",
    ".V().hasLabel(\"person\")\n",
    " .has(\"first_name\",\"Taylor\").has(\"last_name\",\"Hall\")\n",
    " .outE('follows')\n",
    "     .where(inV().has('person','first_name','Dave'))\n",
    " .fold().coalesce(unfold().property('creation','updated'), \n",
    "     addE('follows').from(__.V().has('person','first_name','Taylor')).to(__.V().has('person','first_name','Dave')).property('creation','created')\n",
    " )\n",
    ".V().hasLabel(\"person\")\n",
    " .has(\"first_name\",\"Kelvin\").has(\"last_name\",\"Fernsby\")\n",
    " .outE('follows')\n",
    "     .where(inV().has('person','first_name','Dave'))\n",
    " .fold().coalesce(unfold().property('creation','updated'), \n",
    "     addE('follows').from(__.V().has('person','first_name','Kelvin')).to(__.V().has('person','first_name','Dave')).property('creation','created')\n",
    " )\n",
    ".V().hasLabel(\"person\")\n",
    " .has(\"first_name\",\"Ian\").has(\"last_name\",\"Rochester\")\n",
    " .outE('follows')\n",
    "     .where(inV().has('person','first_name','Dave'))\n",
    " .fold().coalesce(unfold().property('creation','updated'), \n",
    "     addE('follows').from(__.V().has('person','first_name','Ian')).to(__.V().has('person','first_name','Dave')).property('creation','created')\n",
    " )\n",
    ".V().hasLabel('person')\n",
    "    .outE('follows').elementMap()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9abe4c7e",
   "metadata": {},
   "source": [
    "### Exercise 3: Delete all `follows` edges and remove any connected nodes with no other edges.\n",
    "\n",
    "Using the data model above, write a query that will:\n",
    "\n",
    "* Find all the `follows` edges and connected nodes and remove the edges\n",
    "* For each of the connected nodes see if they have any other edges\n",
    "* If they have edges then ignore them\n",
    "* If they have no edges then remove them"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6a3ca71d",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin\n",
    "\n",
    "g.E().hasLabel('follows').aggregate('edges')\n",
    ".bothV()\n",
    ".hasLabel('person')\n",
    ".where(out().count().is(eq(1))).aggregate('nodes')\n",
    ".select('edges').unfold().drop()\n",
    ".select('nodes').unfold().drop()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
